

Section: New Results

Data interlinking

The web of data uses semantic web technologies to publish data on the web in such a way that they can be interpreted and connected together. It is thus critical to be able to establish links between these data, both for the web of data itself and for the semantic web that it helps to feed.

Keys and pseudo-keys detection for web datasets cleansing and interlinking

We have proposed a method for analysing web datasets based on key dependencies. For this purpose, we have adapted the classical notion of a key in relational databases to the case of RDF datasets [9], [16]. In order to better deal with web data of variable quality, we have introduced the definition of a pseudo-key. We have also provided an RDF vocabulary for representing keys and pseudo-keys, and designed and implemented an algorithm for discovering them. Experimental results show that, even for a large dataset such as DBpedia, the runtime of the algorithm remains reasonable. We are currently working on two applications: data cleansing, i.e., detecting and recovering from errors in RDF datasets, and dataset interlinking.

The algorithm is publicly available at https://gforge.inria.fr/projects/melinda/.
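
To make the notions of key and pseudo-key more concrete, here is a minimal sketch in Python. It is not the algorithm distributed at the address above. It assumes, for illustration only, that a key is a set of properties whose combined values identify every subject uniquely, and that a pseudo-key is a set of properties for which this holds for at least a given fraction of the subjects. The function names, the `threshold` parameter and the sample triples are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def describe(triples):
    """Group (subject, property, value) triples as subject -> property -> set of values."""
    desc = defaultdict(lambda: defaultdict(set))
    for s, p, o in triples:
        desc[s][p].add(o)
    return desc


def discriminability(desc, props):
    """Fraction of subjects carrying all `props` whose value combination is unique."""
    groups = defaultdict(list)
    for s, pv in desc.items():
        if all(p in pv for p in props):
            signature = tuple(frozenset(pv[p]) for p in props)
            groups[signature].append(s)
    total = sum(len(g) for g in groups.values())
    if total == 0:
        return 0.0
    unique = sum(1 for g in groups.values() if len(g) == 1)
    return unique / total


def find_pseudo_keys(triples, threshold=1.0, max_size=2):
    """Return (properties, score) pairs whose score reaches `threshold`.

    With threshold=1.0 only exact keys are returned (assumed definition).
    Supersets of an already reported property set are skipped.
    """
    desc = describe(triples)
    properties = sorted({p for pv in desc.values() for p in pv})
    found = []
    for size in range(1, max_size + 1):
        for props in combinations(properties, size):
            if any(set(k) <= set(props) for k, _ in found):
                continue
            score = discriminability(desc, props)
            if score >= threshold:
                found.append((props, score))
    return found


# Hypothetical sample data, not taken from DBpedia.
SAMPLE = [
    ("ex:a", "name", "Ann"), ("ex:a", "city", "Paris"),
    ("ex:b", "name", "Bob"), ("ex:b", "city", "Paris"),
    ("ex:c", "name", "Ann"), ("ex:c", "city", "Lyon"),
    ("ex:d", "name", "Dan"), ("ex:d", "city", "Lyon"),
]

if __name__ == "__main__":
    print(find_pseudo_keys(SAMPLE))                 # exact keys only
    print(find_pseudo_keys(SAMPLE, threshold=0.5))  # tolerate exceptions
```

In this sample, no single property is a key, whereas the pair (city, name) is. Lowering the threshold lets `name` alone qualify despite the two subjects that share the value "Ann". Such exceptions are exactly what a data cleansing application would want to inspect. This brute-force enumeration is only meant for small examples and says nothing about how the actual algorithm scales.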

Data interlinking from expressive alignments

In the context of the Datalift project (see § 8.1.1), we are developing a data interlinking module. Based on our analysis of the relationships between ontology matching and data interlinking [15], our goal is to generate data interlinking scripts from ontology alignments. For that purpose, we have integrated existing technologies within the Datalift platform: the Alignment API, to take advantage of the EDOAL language, and Silk, developed by Freie Universität Berlin, for processing linking scripts. So far, we have generated Silk scripts from ontology alignments in order to produce links.

This work is part of the PhD of Zhengjie Fan, co-supervised with François Scharffe (LIRMM).
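
The idea of deriving links from an alignment, as described above, can be illustrated with a small Python sketch. It uses neither the Alignment API, EDOAL, nor the Silk script syntax. Instead, it assumes a deliberately simplified setting in which an alignment is a list of property correspondences and two instances are linked when all aligned properties carry the same normalised value. The names `generate_links` and `normalise`, and the sample data, are hypothetical.

```python
def normalise(value):
    """Crude value normalisation (assumption): trim whitespace and ignore case."""
    return str(value).strip().lower()


def generate_links(source, target, correspondences):
    """Produce owl:sameAs candidate links between two datasets.

    source, target: dict mapping instance IRI -> dict of property -> value.
    correspondences: list of (source_property, target_property) pairs,
    standing in for an ontology alignment.
    """
    if not correspondences:
        return []  # without correspondences there is nothing to compare
    links = []
    for s_iri, s_props in source.items():
        for t_iri, t_props in target.items():
            if all(
                sp in s_props
                and tp in t_props
                and normalise(s_props[sp]) == normalise(t_props[tp])
                for sp, tp in correspondences
            ):
                links.append((s_iri, "owl:sameAs", t_iri))
    return links


# Hypothetical datasets and alignment.
SOURCE = {"a:1": {"a:label": "Grenoble "}, "a:2": {"a:label": "Lyon"}}
TARGET = {"b:x": {"b:name": "grenoble"}, "b:y": {"b:name": "Paris"}}
ALIGNMENT = [("a:label", "b:name")]

if __name__ == "__main__":
    for link in generate_links(SOURCE, TARGET, ALIGNMENT):
        print(link)
```

A real linking script would typically rely on similarity measures and thresholds rather than exact equality, and expressive alignments can relate more than plain property pairs. The sketch only shows how the correspondences drive which values get compared.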